This README describes the replication files for "Improving Compliance in Experimental Studies of Discrimination" by Aaron Kaufman, Christopher Celaya, and Jacob Grumbach in the Journal of Politics.


#### File Descriptions
There are only five files other than this one:

1) Compliance survey_September 6, 2022_03.39.csv

This file is a Qualtrics export containing the raw survey data described in Section 2 of the paper.

2) prolific_export_6304a475d4f1129467c0dd0c.csv

This file contains respondent-level covariates to be matched to the Qualtrics export file.

3) compliance_replication.R

This is one of two R scripts. It reads in the above two files and produces Figure 1, Table 1, Table A1, Figure A8, Figure A10, and Figure A11.

4) compliance_simulations.R

This is the second of two R scripts. It conducts the simulations described in Section 3 and produces Figure 2 and Figure A9, as well as the intermediate file compliance_simulations.RData.

5) compliance_simulations.RData

This file stores the simulation results from compliance_simulations.R so that users do not need to run the entire simulation procedure to replicate our results.
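
As a minimal sketch of how the cached results might be reused, the snippet below loads the file if it is present. It assumes the working directory is the replication folder; the names of the objects stored in the file are not documented here, so the sketch only prints whatever `load()` restores.

```r
# Sketch: reuse the cached simulation results when the file is present.
# Assumes the working directory is the folder containing the replication files.
results_file <- "compliance_simulations.RData"

if (file.exists(results_file)) {
  # load() restores the saved objects and returns their names
  loaded_objects <- load(results_file)
  print(loaded_objects)
} else {
  message("Cached results not found; run compliance_simulations.R to create them.")
  loaded_objects <- character(0)
}
```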


#### Software and Hardware
All analysis was done in R 4.3.2 with the random seed set. The analysis can be run on a standard laptop. For the simulations, however, we recommend using a server or cluster; our simulations were run on a server with 512 GB of RAM and 64 cores, though that is far in excess of what is actually needed. The two R scripts can be run in either order.


#### Codebook

Lines 19-33 of compliance_replication.R produce our main data set, called dat_race. It contains 9 variables:
1) Prolific ID is an anonymized identifier used to match respondents' Prolific covariates to their Qualtrics responses.
2) name is a treatment condition identifier. The first letter indicates the race of the assigned profile, the second letter indicates the sex, and the next one to three letters indicate the presence or absence of identity cues: picture, name, and resume.
3) value is the key outcome -- the respondent's guess about the race of the resume they were shown.
4) truerace, truesex, and the three treat_* variables are the components of the name variable, split into separate columns.
5) race_correct indicates whether the value variable is the same as the truerace variable.
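
To illustrate how the variables above relate to one another, here is a toy sketch that splits a treatment code into its components and derives race_correct. The letter codes (`W`/`B`, `M`/`F`, `p`/`n`/`r`), the outcome labels, and the names `treat_picture`, `treat_name`, and `treat_resume` are all hypothetical stand-ins; the actual coding is defined in compliance_replication.R.

```r
# Toy data: the treatment codes and labels below are hypothetical,
# chosen only to illustrate the structure described in the codebook.
toy <- data.frame(
  name  = c("WMp", "BFpn", "BMpnr"),
  value = c("White", "White", "Black"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup tables for the first two letters of the code
race_codes <- c(W = "White", B = "Black")
sex_codes  <- c(M = "Male", F = "Female")

toy$truerace <- unname(race_codes[substr(toy$name, 1, 1)])
toy$truesex  <- unname(sex_codes[substr(toy$name, 2, 2)])

# Remaining one to three letters: which identity cues were shown
cues <- substring(toy$name, 3)
toy$treat_picture <- grepl("p", cues, fixed = TRUE)
toy$treat_name    <- grepl("n", cues, fixed = TRUE)
toy$treat_resume  <- grepl("r", cues, fixed = TRUE)

# race_correct: does the respondent's guess match the assigned race?
toy$race_correct <- toy$value == toy$truerace
print(toy)
```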

Line 94 merges in the respondent-level data from Prolific, adding four more variables: Age, Sex, Ethnicity.simplified, and Employment.status. We use these to perform regression adjustment.
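
The merge and adjustment step can be sketched on simulated data as follows. This is an illustration only: the key column `prolific_id`, the treatment column `treat_name`, and the linear probability model are assumptions made for the example, not the specification used in compliance_replication.R.

```r
set.seed(1)  # seed for the toy data only

# Toy stand-ins for the survey data and the Prolific export
survey <- data.frame(
  prolific_id  = sprintf("id%02d", 1:40),
  treat_name   = rep(c(0, 1), times = 20),
  race_correct = rbinom(40, 1, 0.6)
)
prolific <- data.frame(
  prolific_id = sprintf("id%02d", 40:1),
  Age = sample(18:70, 40, replace = TRUE),
  Sex = sample(c("Female", "Male"), 40, replace = TRUE)
)

# Left join: keep every survey response and attach covariates by ID
merged <- merge(survey, prolific, by = "prolific_id", all.x = TRUE)

# Regression adjustment (assumed form): outcome on treatment plus covariates
fit <- lm(race_correct ~ treat_name + Age + Sex, data = merged)
print(coef(fit)["treat_name"])
```

Using `all.x = TRUE` keeps respondents who lack a match in the covariate file, so any unmatched IDs show up as missing covariates rather than silently dropping out of the data.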

Lines 127-138 produce dat_class, which parallels dat_race. It contains the same variables, except that (3) value is the respondent's guess about the class of the resume they were shown.

Lines 148-159 produce dat_hire, which is again the same as the above, except that the outcome is whether the respondent would interview the assigned resume.